Introduction

This tutorial demonstrates the BioGSP (Biological Graph Signal Processing) package for analyzing spatial patterns in multiplexed imaging data. We’ll use the built-in toy CODEX dataset to showcase the package’s capabilities, including:

  • Multiple kernel families: Mexican hat, Meyer, and heat
  • Spatial pattern decomposition: Multi-scale analysis of cell type distributions with the spectral graph wavelet transform (SGWT)
  • Graph construction and analysis: Eigenvalue analysis and graph properties
  • Cross-cell type correlation: Advanced correlation measures
  • Comprehensive visualization: Filter banks, decompositions, and spatial patterns

Load and Explore the Toy Dataset

## 'data.frame':    18604 obs. of  5 variables:
##  $ cellLabel  : chr  "ROI_0_BCL6BCell_1" "ROI_0_BCL6BCell_2" "ROI_0_BCL6BCell_3" "ROI_0_BCL6BCell_4" ...
##  $ Y_cent     : num  64.5 71.2 78.3 50.8 52.6 ...
##  $ X_cent     : num  112 82.6 91.4 92.6 98.8 ...
##  $ Annotation5: chr  "BCL6- B Cell" "BCL6- B Cell" "BCL6- B Cell" "BCL6- B Cell" ...
##  $ ROI_num    : chr  "ROI_0" "ROI_0" "ROI_0" "ROI_0" ...
##                 cellLabel   Y_cent    X_cent  Annotation5 ROI_num
## ROI_0.1 ROI_0_BCL6BCell_1 64.52399 111.96648 BCL6- B Cell   ROI_0
## ROI_0.2 ROI_0_BCL6BCell_2 71.22437  82.56243 BCL6- B Cell   ROI_0
## ROI_0.3 ROI_0_BCL6BCell_3 78.33008  91.44301 BCL6- B Cell   ROI_0
## ROI_0.4 ROI_0_BCL6BCell_4 50.83888  92.60675 BCL6- B Cell   ROI_0
## ROI_0.5 ROI_0_BCL6BCell_5 52.64693  98.76092 BCL6- B Cell   ROI_0
## ROI_0.6 ROI_0_BCL6BCell_6 56.20720  91.39930 BCL6- B Cell   ROI_0

Summary of Cell Types in Toy CODEX Data

Annotation5   Count  Mean_X  Mean_Y   SD_X   SD_Y
CD4 T          4092   45.64   45.52  21.49  23.61
BCL6- B Cell   3719   55.43   45.64  22.96  20.28
CD8 T          3346   47.38   48.53  23.91  27.34
DC             2233   50.17   46.15  23.18  24.00
CD4 Treg       1490   53.46   40.06  23.25  19.53
M1             1490   46.59   54.36  21.19  22.62
BCL6+ B Cell    931   36.53   51.97  21.03  23.56
Endothelial     746   51.29   45.41  22.90  19.33
M2              370   56.67   37.24  28.21  19.59
Myeloid         186   47.60   39.69  21.09  20.63
Other             1  107.12   46.12     NA     NA

## 
## Target cell types for SGWT analysis:
## BCL6- B Cell: 3719
## CD4 T cells: 4092
## 
## ROI summary:
## 
##  ROI_0  ROI_1 ROI_10 ROI_11 ROI_12 ROI_13 ROI_14 ROI_15  ROI_2  ROI_3  ROI_4 
##    952    945   1315   1045   1190   1174   1497   1316   1155   1422   1097 
##  ROI_5  ROI_6  ROI_7  ROI_8  ROI_9 
##   1420   1059    958   1114    945
## 
## Cell types by ROI:
##         
##          BCL6- B Cell CD4 T CD8 T  DC  M1
##   ROI_0           190   209   171 114  76
##   ROI_1           189   208   170 113  76
##   ROI_10          263   289   237 158 105
##   ROI_11          209   230   188 125  84
##   ROI_12          238   262   214 143  95
##   ROI_13          235   258   211 141  94
##   ROI_14          299   329   269 180 120
##   ROI_15          263   290   237 158 105
##   ROI_2           231   254   208 139  92
##   ROI_3           284   313   256 171 114
##   ROI_4           219   241   197 132  88
##   ROI_5           284   312   256 170 114
##   ROI_6           212   233   190 127  85
##   ROI_7           191   211   172 115  77
##   ROI_8           223   245   200 134  89
##   ROI_9           189   208   170 113  76
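
The cell-type summary and ROI tables above can be produced with base R alone. The following is a minimal sketch, not the tutorial's own code: it uses a small, randomly generated stand-in data frame (`cells`, hypothetical) with the same column names as the `str()` output, and a hypothetical helper `summarise_type()`. Note that `sd()` of a single value is `NA`, which is why the one-cell "Other" row has `NA` in both SD columns, and that `table()` sorts ROI labels alphabetically, which is why `ROI_10` follows `ROI_1`.

```r
# Hypothetical stand-in for the toy data: same column names as the str()
# output above, but randomly generated coordinates.
set.seed(1)
cells <- data.frame(
  X_cent      = runif(7, 0, 100),
  Y_cent      = runif(7, 0, 100),
  Annotation5 = c("CD4 T", "CD4 T", "CD4 T", "CD8 T", "CD8 T", "CD8 T", "Other"),
  ROI_num     = c("ROI_0", "ROI_0", "ROI_1", "ROI_0", "ROI_1", "ROI_1", "ROI_1"),
  stringsAsFactors = FALSE
)

# Per-cell-type count, mean position, and positional spread.
summarise_type <- function(d) {
  data.frame(
    Annotation5 = d$Annotation5[1],
    Count  = nrow(d),
    Mean_X = round(mean(d$X_cent), 2),
    Mean_Y = round(mean(d$Y_cent), 2),
    SD_X   = round(sd(d$X_cent), 2),   # NA when the type has a single cell
    SD_Y   = round(sd(d$Y_cent), 2)
  )
}

cell_summary <- do.call(rbind, lapply(split(cells, cells$Annotation5), summarise_type))
cell_summary <- cell_summary[order(-cell_summary$Count), ]  # most abundant first
rownames(cell_summary) <- NULL
print(cell_summary)

# Cells per ROI, and cell types by ROI.
roi_counts  <- table(cells$ROI_num)
type_by_roi <- table(cells$ROI_num, cells$Annotation5)
print(roi_counts)
print(type_by_roi)
```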

BioGSP_Tutorial.R

Visualize Spatial Distribution

Kernel Family Comparison

## Demonstrating different kernel family options available in SGWT...

## 
## Available kernel family types:
## 1. mexican_hat
## 2. meyer
## 3. heat
## 
## Each kernel family provides:
## - mexican_hat: Gaussian scaling + LoG-style wavelet
## - meyer: Smooth cosine low-pass + band-pass transitions
## - heat: Exponential decay scaling + derivative-like wavelet
## 
## Kernel values at different eigenvalues:
##            x mexican_hat_scaling mexican_hat_wavelet meyer_scaling
## 1 0.00000000           1.0000000        0.0000000000             1
## 2 0.01507538           0.9998864        0.0002272412             1
## 3 0.03015075           0.9995456        0.0009086548             1
## 4 0.04522613           0.9989778        0.0020433121             1
## 5 0.06030151           0.9981835        0.0036296666             1
## 6 0.07537688           0.9971632        0.0056655569             1
## 7 0.09045226           0.9959176        0.0081482106             1
## 8 0.10552764           0.9944474        0.0110742486             1
##   meyer_wavelet heat_scaling heat_wavelet
## 1             0    1.0000000   0.00000000
## 2             0    0.9850377   0.01484981
## 3             0    0.9702992   0.02925525
## 4             0    0.9557813   0.04322629
## 5             0    0.9414806   0.05677270
## 6             0    0.9273939   0.06990406
## 7             0    0.9135179   0.08262976
## 8             0    0.8998496   0.09495900
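
The printed Mexican hat and heat columns are consistent with simple closed forms. The sketch below is inferred from the numbers in the table, not taken from the package source: it assumes the eigenvalue grid is 200 evenly spaced points on [0, 3] (which matches the printed spacing 0.01507538 = 3/199), a Gaussian scaling kernel `exp(-x^2 / 2)` with wavelet `x^2 * exp(-x^2 / 2)`, and a heat scaling kernel `exp(-x)` with wavelet `x * exp(-x)`. The Meyer family is piecewise and is left out here.

```r
# Assumed eigenvalue grid: 200 points on [0, 3] (spacing 3/199).
x <- seq(0, 3, length.out = 200)

# Closed forms inferred from the printed table (an assumption, not the
# package's documented definitions).
kernels <- data.frame(
  x                   = x,
  mexican_hat_scaling = exp(-x^2 / 2),        # low-pass: 1 at x = 0
  mexican_hat_wavelet = x^2 * exp(-x^2 / 2),  # band-pass: 0 at x = 0
  heat_scaling        = exp(-x),
  heat_wavelet        = x * exp(-x)
)

print(head(kernels, 8), digits = 7)
```

In every family the scaling kernel equals 1 at eigenvalue 0 and the wavelet kernel equals 0 there, so the scaling function captures the smooth, low-frequency part of a signal and the wavelets capture variation at progressively finer scales.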

Create Binned Data for SGWT Analysis

## Binned data created successfully!
## BCL6- B Cell bins with cells: 157 out of 900
## CD4 T bins with cells: 171 out of 900
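
Binning turns point coordinates into a signal on a regular grid, with one value per bin. A minimal base-R sketch of that idea follows; `bin_counts()` is a hypothetical helper, the points are simulated, and the package's own binning (for example, whether it stores counts or a presence indicator) may differ.

```r
# Count points in each cell of an n_bins x n_bins grid spanning the data range.
bin_counts <- function(x, y, n_bins = 30) {
  x_breaks <- seq(min(x), max(x), length.out = n_bins + 1)
  y_breaks <- seq(min(y), max(y), length.out = n_bins + 1)
  # all.inside = TRUE keeps points on the upper boundary in the last bin
  ix <- findInterval(x, x_breaks, all.inside = TRUE)
  iy <- findInterval(y, y_breaks, all.inside = TRUE)
  counts <- table(factor(ix, levels = seq_len(n_bins)),
                  factor(iy, levels = seq_len(n_bins)))
  # expand.grid varies x fastest, matching the column-major table layout
  grid <- expand.grid(x = seq_len(n_bins), y = seq_len(n_bins))
  grid$count <- as.vector(counts)
  grid
}

# Simulated coordinates standing in for one cell type in one ROI.
set.seed(42)
px <- runif(189, 0, 100)
py <- runif(189, 0, 100)

binned <- bin_counts(px, py, n_bins = 30)
cat("Bins with cells:", sum(binned$count > 0), "out of", nrow(binned), "\n")
```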

SGWT Analysis

Apply SGWT Analysis

## Applying SGWT to BCL6nB cell distribution...
## Building graph from spatial coordinates...
## Computing Laplacian and eigendecomposition...
## Auto-generated scales: 0.4886, 0.2443, 0.1221 
## Performing SGWT decomposition...
## Reconstruction RMSE: 0.11396
## SGWT analysis completed for BCL6nB
## Reconstruction RMSE: 0.11396
## 
## Applying SGWT to CD4T cell distribution...
## Building graph from spatial coordinates...
## Computing Laplacian and eigendecomposition...
## Auto-generated scales: 0.4886, 0.2443, 0.1221 
## Performing SGWT decomposition...
## Reconstruction RMSE: 0.149172
## SGWT analysis completed for CD4T
## Reconstruction RMSE: 0.149172
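
The log lines above follow the usual SGWT pipeline: build a graph, eigendecompose its Laplacian, filter the signal with a scaling kernel and several scaled wavelet kernels, then reconstruct. The sketch below walks through those steps in base R on a small lattice. It is an illustration under stated assumptions, not BioGSP's implementation: the 4-neighbour lattice, the Mexican-hat-style kernels, the scales `c(2, 1, 0.5)`, and the least-squares inverse are all choices made here, so the RMSE it prints is not comparable to the values in the log.

```r
# Graph: a 6 x 6 lattice where bins at Manhattan distance 1 are neighbours.
n_side <- 6
grid <- expand.grid(x = seq_len(n_side), y = seq_len(n_side))
n <- nrow(grid)
A <- (as.matrix(dist(grid, method = "manhattan")) == 1) * 1
L <- diag(rowSums(A)) - A                 # combinatorial Laplacian D - A

# Eigendecomposition: columns of U are the graph Fourier basis.
eig <- eigen(L, symmetric = TRUE)
lambda <- eig$values
U <- eig$vectors

# Assumed kernels and scales (illustrative only).
scaling_kernel <- function(t) exp(-t^2 / 2)
wavelet_kernel <- function(t) t^2 * exp(-t^2 / 2)
scales <- c(2, 1, 0.5)

# Filter bank in the spectral domain: scaling filter, then one wavelet per scale.
G <- cbind(scaling_kernel(lambda),
           sapply(scales, function(s) wavelet_kernel(s * lambda)))

# Signal: a small blob of "occupied" bins.
f <- as.numeric((grid$x - 2)^2 + (grid$y - 2)^2 <= 2)

# Forward transform: coefficient vector j is U diag(G[, j]) U^T f.
f_hat  <- as.vector(crossprod(U, f))      # graph Fourier transform of f
coeffs <- U %*% (G * f_hat)               # n x (1 + number of scales)

# Least-squares inverse: divide by the summed squared filter responses.
c_hat <- crossprod(U, coeffs)
f_rec <- as.vector(U %*% (rowSums(G * c_hat) / rowSums(G^2)))

rmse <- sqrt(mean((f - f_rec)^2))
cat("Reconstruction RMSE:", signif(rmse, 3), "\n")
```

Because the scaling kernel is strictly positive, the summed squared filter response is never zero and this inverse recovers the signal to numerical precision. A reconstruction that does not apply this normalisation would generally leave a nonzero residual.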

Compare Different Kernel Families

## Testing different kernel families on BCL6nB data...
## Testing mexican_hat kernel family...
## RMSE with mexican_hat kernel: 0.11396 
## Testing meyer kernel family...
## RMSE with meyer kernel: 0.11527 
## Testing heat kernel family...
## RMSE with heat kernel: 0.113988

Reconstruction Error Comparison Across Different Kernel Families

Kernel_Family  RMSE      Description
mexican_hat    0.113960  Gaussian scaling + LoG wavelet
meyer          0.115270  Smooth cosine transitions
heat           0.113988  Exponential decay + derivative

SGWT Decomposition Visualization

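library(ggplot2)    # ggplot(), geom_tile(), scales and themes
library(patchwork)  # wrap_plots()

# Function to visualize SGWT components: returns a named list of tile plots
# (original signal, scaling coefficients, up to three wavelet scales, reconstruction)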
plot_sgwt_components <- function(sgwt_result, data_in, cell_type, color_val) {
  coefficients <- sgwt_result$decomposition$coefficients
  plot_data <- data_in
  
  plots <- list()
  
  # Original signal
  plot_data$original <- sgwt_result$original_signal
  p_orig <- ggplot(plot_data, aes(x = x, y = y, fill = original)) +
    geom_tile() +
    scale_fill_gradient2(low = "white", high = color_val) +
    labs(title = paste(cell_type, "- Original")) +
    theme_void() + scale_y_reverse() + coord_equal() +
    theme(legend.position = "none")
  plots[["original"]] <- p_orig
  
  # Scaling function
  plot_data$scaling <- Re(as.vector(coefficients[[1]]))
  p_scaling <- ggplot(plot_data, aes(x = x, y = y, fill = scaling)) +
    geom_tile() +
    scale_fill_gradient2(low = "blue", mid = "white", high = "red") +
    labs(title = "Scaling (Low-freq)") +
    theme_void() + scale_y_reverse() + coord_equal() +
    theme(legend.position = "none")
  plots[["scaling"]] <- p_scaling
  
  # Wavelet coefficients
  for (i in seq_len(min(3, length(coefficients) - 1))) {  # seq_len() skips the loop when there are no wavelet scales
    coeff_name <- paste0("wavelet_", i)
    plot_data[[coeff_name]] <- Re(as.vector(coefficients[[i + 1]]))
    
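    # aes_string() is deprecated; `!!` injects the current column name so each
    # plot keeps its own wavelet scale after the loop variable changes
    p_wavelet <- ggplot(plot_data, aes(x = x, y = y, fill = .data[[!!coeff_name]])) +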
      geom_tile() +
      scale_fill_gradient2(low = "blue", mid = "white", high = "red") +
      labs(title = paste("Wavelet", i)) +
      theme_void() + scale_y_reverse() + coord_equal() +
      theme(legend.position = "none")
    
    plots[[coeff_name]] <- p_wavelet
  }
  
  # Reconstructed signal
  plot_data$reconstructed <- sgwt_result$reconstructed_signal
  p_recon <- ggplot(plot_data, aes(x = x, y = y, fill = reconstructed)) +
    geom_tile() +
    scale_fill_gradient2(low = "white", high = color_val) +
    labs(title = "Reconstructed") +
    theme_void() + scale_y_reverse() + coord_equal() +
    theme(legend.position = "none")
  plots[["reconstructed"]] <- p_recon
  
  return(plots)
}

# Plot BCL6nB decomposition
bcl6nb_plots <- plot_sgwt_components(sgwt_bcl6nb, bcl6nb_sgwt_data, "BCL6nB", 
                                     color_mapping_vector["BCL6nB"])
bcl6nb_combined <- wrap_plots(bcl6nb_plots, ncol = 3)
print(bcl6nb_combined)

Cross-Cell Type Correlation Analysis

## Performing cross-correlation analysis between BCL6nB and CD4T...
## Graph Cross-Correlation between BCL6nB and CD4T: 0.9412
## Cosine similarity between BCL6nB and CD4T: 0.3414
## Pearson correlation between BCL6nB and CD4T: 0.2188

Cross-Cell Type Correlation Analysis: BCL6nB vs CD4T

Method                         Value   Description
Graph Cross-Correlation (GCC)  0.9412  Frequency-domain correlation using graph structure
Cosine Similarity              0.3414  Geometric similarity measure
Pearson Correlation            0.2188  Linear correlation coefficient
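
The two vertex-domain measures in the table are easy to compute directly, which helps when interpreting why they differ. The sketch below uses two made-up count vectors and a hypothetical helper `cosine_similarity()`. It shows that Pearson correlation is the cosine similarity of the mean-centered signals, so on non-negative count data the two values generally do not agree. The graph cross-correlation is not sketched because its definition is not given in this tutorial.

```r
# Cosine similarity: the normalised inner product of two signals.
cosine_similarity <- function(a, b) {
  sum(a * b) / sqrt(sum(a^2) * sum(b^2))
}

# Made-up bin counts for two cell types on the same six bins.
a <- c(3, 0, 1, 0, 2, 0)
b <- c(1, 0, 0, 2, 2, 0)

cos_ab     <- cosine_similarity(a, b)
pearson_ab <- cor(a, b)

# Pearson correlation equals cosine similarity after mean-centering.
centered_cos <- cosine_similarity(a - mean(a), b - mean(b))

cat("Cosine similarity:   ", round(cos_ab, 4), "\n")
cat("Pearson correlation: ", round(pearson_ab, 4), "\n")
cat("Cosine of centered:  ", round(centered_cos, 4), "\n")
```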

Cross-ROI Analysis

## === Demonstrating Cross-ROI Analysis ===
## ROI ROI_1 - BCL6- B Cell: 189 
##   X range: 28.14 59.85 
##   Y range: 76.06 106.72 
## Building graph from spatial coordinates...
## Computing Laplacian and eigendecomposition...
## Auto-generated scales: 0.5156, 0.2578 
## Performing SGWT decomposition...
## Reconstruction RMSE: 0.1359 
##   SGWT RMSE: 0.1359 
##   Active bins: 109 out of 225
## 
## ROI ROI_2 - BCL6- B Cell: 231 
##   X range: 49.92 87.46 
##   Y range: 13.05 46.97 
## Building graph from spatial coordinates...
## Computing Laplacian and eigendecomposition...
## Auto-generated scales: 0.5156, 0.2578 
## Performing SGWT decomposition...
## Reconstruction RMSE: 0.099712 
##   SGWT RMSE: 0.0997 
##   Active bins: 99 out of 225
## 
## ROI ROI_3 - BCL6- B Cell: 284 
##   X range: 36.21 95.79 
##   Y range: 16.7 70.47 
## Building graph from spatial coordinates...
## Computing Laplacian and eigendecomposition...
## Auto-generated scales: 0.5156, 0.2578 
## Performing SGWT decomposition...
## Reconstruction RMSE: 0.074979 
##   SGWT RMSE: 0.075 
##   Active bins: 96 out of 225

Cross-ROI BCL6- B Cell Analysis Comparison

ROI    BCL6_B_Cells  Active_Bins  SGWT_RMSE
ROI_1           189          109     0.1359
ROI_2           231           99     0.0997
ROI_3           284           96     0.0750

## Cross-ROI analysis demonstrates how spatial patterns vary across regions!

Summary

## === SGWT Analysis Summary ===
## Dataset: Multi-ROI Toy CODEX spatial data
## - Total cells analyzed: 18604
## - Number of ROIs: 16
## - ROI_1 focused analysis (30x30 grid): 900 bins
## - BCL6nB active bins (ROI_1): 157
## - CD4T active bins (ROI_1): 171
## 
## SGWT Analysis Results:
## - BCL6nB reconstruction RMSE: 0.11396
## - CD4T reconstruction RMSE: 0.149172
## - Number of scales: 3
## - Kernel type: mexican_hat
## 
## Cross-Correlation Results:
## - Graph Cross-Correlation: 0.9412
## - Cosine Similarity: 0.3414
## - Pearson Correlation: 0.2188
## 
## ✓ Moderate to strong spatial correlation detected between cell types
## 
## ✓ Tutorial completed successfully using the BioGSP package!

Conclusion

This tutorial demonstrated the key capabilities of the BioGSP package:

  1. Data Integration: Using built-in toy CODEX data for reproducible analysis
  2. Kernel Families: Comparing Mexican hat, Meyer, and heat kernel families
  3. Spatial Analysis: Multi-scale decomposition of spatial cell patterns
  4. Graph Analysis: Construction and validation of spatial graphs
  5. Correlation Analysis: Advanced measures for cross-cell type relationships

The BioGSP package provides a comprehensive framework for spatial analysis of multiplexed imaging data, enabling researchers to uncover multi-scale spatial patterns and relationships in complex tissue environments.

For more information, see ?codex_toy_data for dataset details and explore other BioGSP functions using help(package = "BioGSP").